Poster: Albretch Date: Dec 1, 2020 6:51am
Forum: texts Subject: approximately total number of published texts and which percentage has been so far digitized ...

I have heard statements by Google, Microsoft, ... about how much text is apparently accessible over the Internet.

There are also estimates of the total number of books ever published: 129,864,880, which is not a large number at all.

Can anyone here answer the following, or point me to a reliable source for such info?

* Is there a registry of the titles and other publication metadata for those books, per language?

* Which of those books have actually been read by people, socially, over generations?

* What total amount or percentage of those books has been digitized?

* Of the books that have been digitized, which ones have been converted to text in a faithful way?

When you try to get the text version of many of the PDF texts right here, you realize that the conversion was based on some Tesseract kind of automation, so the quality of the texts is not that good (to say the least).